Machine learning analysis predicts a person’s sex based on mechanical but not thermal pain thresholds

Sex differences in pain perception have been extensively studied, but precision medicine applications such as sex-specific pain pharmacology have barely progressed beyond proof-of-concept. A data set of pain thresholds to mechanical (blunt and punctate pressure) and thermal (heat and cold) stimuli applied to non-sensitized and sensitized (capsaicin, menthol) forearm skin of 69 male and 56 female healthy volunteers was analyzed for data structures contingent with the prior sex structure using unsupervised and supervised approaches. A working hypothesis that the relevance of sex differences could be approached via reversibility of the association, i.e., sex should be identifiable from pain thresholds, was verified with trained machine learning algorithms that could infer a person's sex in a 20% validation sample not seen to the algorithms during training, with balanced accuracy of up to 79%. This was only possible with thresholds for mechanical stimuli, but not for thermal stimuli or sensitization responses, which were not sufficient to train an algorithm that could assign sex better than by guessing or when trained with nonsense (permuted) information. This enabled the translation to the molecular level of nociceptive targets that convert mechanical but not thermal information into signals interpreted as pain, which could eventually be used for pharmacological precision medicine approaches to pain. By exploiting a key feature of machine learning, which allows for the recognition of data structures and the reduction of information to the minimum relevant, experimental human pain data could be characterized in a way that incorporates "non" logic that could be translated directly to the molecular pharmacological level, pointing toward sex-specific precision medicine for pain.

Sex specific differences in pain perception have been extensively studied and reviewed [1][2][3] . A literature search of the PubMed database at https:// pubmed. ncbi. nlm. nih. gov/ on November 6, 2022 yielded 22,322 hits, with increasing total number and relative number to publications on pain (Fig. 1). The earliest hit in this search was a paper on sex differences in the apperceptive background of pain experience published in 1959 4 , in which it was already mentioned that "females were generally predisposed to see more pain-both Intensity and duration". Thus, the sex difference in pain seems to be well studied, and the higher sensitivity of females seems to be the most common finding.
However, statistical significance does not equate with relevance. In the present analyses a first working hypothesis for addressing the relevance of observed sex differences was the reversibility of the association. If pain thresholds differ to a relevant degree between sexes, then the mutual association of pain with sex should be reversible. It should be possible to learn sex from the pain thresholds of a given sample of men and women and then generalize this skill to infer sex from the pain thresholds of new subjects. This formulation of the question of sex differences in pain is consistent with the definition of machine learning 8 . Having transferred the sex problem from statistics to machine learning, a second hypothesis arises that considers the need, typical of machine learning, to filter out the "needles" from the "haystack", i.e., to (only) use the information relevant to the problem from complex data sets. The ability to learn sex from pain thresholds may not require or be possible with every type of pain threshold, but may be (only) possible for certain types of pain thresholds. Thus, a second working hypothesis was that sex differences in pain thresholds could be used for future precision medicine approaches based on the difference in information content of thresholds for specific stimuli to infer sex. Ideally a logical "NOT" connection should be identified in the sense that sex would be related to pain threshold for one type of Research interests in sex aspects of pain as reflected by publication activity. Results of a computeraided literature search of the PubMed database at https:// pubmed. ncbi. nlm. nih. gov/ on November 6, 2022, using the R package "RISmed" (https:// cran.r-proje ct. org/ packa ge= RISmed 5 ) and querying the string "pain AND ((sex OR gender) AND (differences OR difference)) NOT review[PT]". (A) Stacked bar chart of all publications listed in PubMed per year, with special emphasis on publications mentioning "sex"/ "gender" along with "pain" compared to all publications mentioning "pain". (B) Line graph of relative number of publications on gender/ sex differences in pain compared to all publications on pain. The figure has been created using the R software package (version 4.2.2 for Linux; http:// CRAN.R-proje ct. org/ 6 ) and the library "ggplot2″ (https:// cran.r-proje ct. org/ packa ge= ggplo t2 7  www.nature.com/scientificreports/ The study assessed pain thresholds defined as "the least experience of pain which a subject can recognize" (http:// www. iasp-pain. org) and implemented in the study as the physical strength at which the answer to the question "does it hurt?" changes from "no" to "yes", which is why the units of pain thresholds are the units of physical strength of the applied stimuli. Thermal stimuli comprised heat and cold. Heat stimuli were applied using a 3 × 3 cm thermode (Thermal Sensory Analyzer, Medoc Advanced Medical Systems Ltd., Ramat Yishai, Israel) placed onto the skin of the left volar forearm. Its temperature was increased from 32 °C by 0.3 °C/s until the subject pressed a button at the first sensation of pain, which triggered cooling of the thermode by approximately 1.2 °C/s. Heat stimuli were applied eight times at random intervals of 25-35 s. The median of the last five responses was defined as the heat pain threshold because in previous experiments a plateau was reached after the first three measurements. Cold stimuli were administered in a similar manner, except that the temperature of the thermode decreased by 1 °C/s from 32 to 0 °C. Five repetitions were used. The cut-off temperatures of the stimulation device were 0 °C and 50 °C. Mechanical stimuli comprised blunt and punctate pressure. The former was exerted perpendicularly onto the dorsal side of mid-phalanx of the right middle finger using a pressure algometer with a circular and flat probe of 1 cm diameter (JTECH Medical, Midvale, USA). The pressure was increased at a rate of approximately 9 N/cm 2 per second until the subject reported pain. The procedure was repeated five times every 30 s. Punctate pressure was exerted onto the left volar forearm using von Frey hairs (0.008, 0.02, 0.04, 0.07, 0.16, 0.4, 0.6, 1, 1.4, 2, 4, 6, 8, 10, 15, 26, 60, 100, 180, 300 g; North Coast Medical Inc., Morgan Hill, CA, USA). Electrical stimuli were applied using a constant current device (Neurometer ® CPT, Neurotron Inc., Baltimore, MD). It delivered sine-wave stimuli at 5 Hz applied via two gold electrodes placed on the medial and lateral side of the mid-phalanx of the right middle finger. Their intensity was increased from 0 to 20 mA by 0.2 mA/s until the subjects interrupted the current by releasing a button. Measurements were repeated five times every 30 s. Sensitization was assessed with punctate mechanical and heat 11 stimuli and obtained using capsaicin cream (0.1 g, 0.1%, manufactured by the local pharmacy) applied onto a 3 × 3 cm skin area on the left volar forearm and covered with a plaster for 20 min. For cold stimuli 12 a menthol solution (2 ml of a 40% menthol solution dissolved in ethanol) was used instead of capsaicin creme. Data analysis. The goal of the data analysis was to (1) identify structure in the pain threshold data reflecting the prior classification of study participants by sex, (2) to determine which variables among pain threshold drive this structure, and (3) to translate this into a molecular hypothesis for sex-dependent pain perception and treatment. A variety of analytical techniques have been applied, including artificial intelligence (AI) methods selected from its currently most widely used subfield, machine learning. Overviews of the goals and applications of machine learning in pain research have been provided elsewhere 10,13 . In the present analyses, the focus was on detecting relevant structure in the data consistent with sex segregation. This was addressed using unsupervised and supervised methods. The main difference is the prior class labeling of the cases, i.e., in unsupervised analysis, the data structure is detected without knowledge of the class structure. In supervised methods, the algorithms are given training data, i.e., variables obtained from the cases, to learn the assignment of a case to a particular class. The success of this learning process is supervised because the class labels of the training samples are known. Only later, the trained algorithms have to prove their learned skill on new cases, using the same kind of variables they were trained with to assign a case to a particular class. Perhaps a school analogy will make this clearer: during learning, the "teacher/supervisor" (researcher/programmer) supervises the learning progress of the algorithms based on the knowledge of the correct result. Only when the task has been successfully learned can the algorithm be trusted to assign the correct class label (in this case, male/female) to new cases. These were available as a 20% validation sample, separated from the data before the algorithms were trained (see below). Both classes of analysis benefit from eliminating unnecessary information from relevant content, which is a standard part of machine learning workflow referred to as feature selection 14 . In the present analysis, several methods from both classes have been applied to obtain robust and internally validated results not relying on a single method.
The programming was performed in the R language 15 using the R software package 6 , version 4.2.2 for Linux, available from the Comprehensive R Archive Network (CRAN) at https:// CRAN.R-proje ct. org/, and in the Python language 16 using Python version 3.8.13 for Linux, available free of charge at https:// www. python. org. Analyses were performed on 1-64 cores/threads on an AMD Ryzen Threadripper 3970X (Advanced Micro Devices, Inc., Santa Clara, CA, USA) computer running Ubuntu Linux 22.04.1 LTS (Canonical, London, UK), except that deep learning neural networks were trained on an NVIDIA GeForce RTX 3060 graphics processing unit (GPU) (NVIDIA Corporation, Santa Clara, CA, USA). Data preprocessing. Data preprocessing consisted of log-transformation and imputation of censored data described in full detail in the supplementary materials. This provided a 125 × 11 sized input data space, X, of pain thresholds (125 subjects, 11 variables; Fig. 2), while the output data space, y, consisted of the binary sex information.
Unsupervised analyses. Unsupervised methods were applied to analyze the univariate modal distribution of variables and cluster structures in the multivariate data space. Unsupervised methods included Gaussian mixture modeling for univariate data, five different and projection methods for multivariate data (principal component analysis (PCA) 17 , independent component analysis (ICA) 18 , multidimensional scaling (MDS) 19,20 , isomap 21 and t-distributed stochastic neighborhood embedding (t-SNE) 22 ) and seven different clustering methods (k-means clustering 23 , partitioning around medoids (PAM) 24  www.nature.com/scientificreports/ Supervised analyses were performed in which machine learning algorithms were trained to infer the sex of the subjects from pain-related information. Supervised methods included five different machine learning algorithms (support vector machines (SVM) 26 , random forests 27,28 , logistic regression 29 , linear discriminant analysis (LDA) 30 and a "deep learning" 31 neural network 32,33 ). Set-up and results of supervised analyses are described in detail as follows.

Supervised analyses.
Supervised structure detection assumed that if a computational algorithm can be trained with pain threshold data to assign a subject to the correct class, i.e., sex, so that it can infer the sex of new subjects from their pain threshold pattern, then the pain thresholds contain information that is actually relevant to the sex of the subjects.
Supervised analyses were performed using cross-validation and training/test/validation splits of the data set. Specifically, prior to feature selection and classifier training, a class-proportional random sample of 20% of the dataset was set aside as a validation sample that was not touched further during algorithm training and feature selection. The 80/20% split of the data set was done in an automated manner using our R package "opdisDownsampling" (https:// cran.r-proje ct. org/ packa ge= opdis Downs ampli ng), which selected from 10 6 random samples the one in which the distributions of the variables were most similar to those of these variables in the complete data set 34 . The remaining 80% of the dataset were considered as training sample, in which feature selection and classifier training were performed, using further training/test splits of this "training" sample in cross-validation settings; however, without touching the 20%-validation sample.
Feature selection. Feature selection aimed at filtering the "needles" from the "haystack", i.e., identifying the relevant pain threshold variables for the sex differences. Several feature selection methods were used 14 available in the Python package "scikit-learn" (https:// scikit-learn. org/ stable/ 35 ). Univariate methods were used that are based on the false positive rate test and the family wise error rate. Multivariate feature selection methods were used that are based on calculation of an importance estimate for each variable following training of four different classification algorithms (SVM, random forests, logistic regression, LDA). After training of the algorithms, the most relevant variables were selected using methods from the "sklearn.feature_selection" module of scikit-learn. This included the F value based feature selection as implemented in the "SelectKBest" method. The number k of the features to be selected was determined by a grid search of [1,…,11] variables. Further methods comprised "SelectFromModel" (SFM), which selects features based on importance weights in the trained algorithm, recursive feature elimination (RFE), which selects features, by recursively considering smaller and smaller feature sets and generating a feature ranking, forward and backward sequential feature selection (SFS), which iteratively find the best features by adding features to a set of initially zero and all features, respectively. Furthermore, the generic permutation importance provided in the "permutation_importance" method of the "sklearn.inspection" package was used, setting the number of permutations to n_repeats = 50. The feature selection methods were applied in a 5 × 20 nested cross-validation scenario provided with the "RepeatedStratifiedKFold" method from the "sklearn. Because for cold stimuli higher threshold values mean higher pain sensitivity whereas for mechanical and heat stimuli, lower threshold values mean higher pain sensitivity, the direction of differences seems inconsistent. Therefore, the columns are colored for the direction of sex differences. (C) Correlation matrix of pain thresholds. Each cell is colored according to the correlation coefficient and labeled with the respective p value. The figure has been created using the R software package (version 4.2.2 for Linux; http:// CRAN.R-proje ct. org/ 6 ) and the R library "ggplot2" (https:// cran.r-proje ct. org/ packa ge= ggplo t2 7 ). www.nature.com/scientificreports/ model_selection" module of "scikit-learn", setting the parameters "n_splits" = 5 and "n_repeats" = 20. All classifiers used during feature selection were tuned with respect to relevant hyperparameters. With the two univariate methods and the six multivariate methods run with four classification algorithms, the total number of feature selection methods summed to m = 26. In the resulting 11 × 26 feature matrix, i.e., 11 pain threshold variables versus 26 selection methods, initially filled with zeros, each feature received a value of 1 when selected. This resulted in a sum score for the number of selections for each pain-related variable. The final feature set was obtained from the majority vote of the 26 selection methods. Therefore, the sum score was submitted to computed ABC (cABC) analysis 36 , an item categorization technique that divides a set of positive numerical data into three disjoint subsets labeled "A" to "C". Subset "A" contains the "important few", which were retained as "reduced" feature sets, whereas subset C" contains the "trivial many" 37 . The Python implementation is available as our package "cABCanalysis" at https:// pypi. org/ proje ct/ cABCa nalys is/ 38 .
Classifier performance evaluation. The main classifiers were imported from the Python package "scikit-learn" (https:// scikit-learn. org/ stable/ 35 ). SVM, implemented as "linearSVC" and random forests were selected as two commonly used classification algorithms of different types, i.e., class separation using hyperplanes in data projected to higher dimensions, or class separation using an ensemble of simple decision trees. In addition, logistic regression and LDA were included as classical methods for class assignment. For comparison, a "deep learning" 31 neural network 32,33 was used to include a completely different type of classifier, using the GPU-only version of the "TensorFlow" machine learning platform (https:// www. tenso rflow. org 39 ) for Python.
The classifiers were tuned to the respective training data set using a grid search approach for different combinations of hyperparameters. For example, the tuning of the algorithms led to the selection of ridge regression as the regularization method for SVM and logistic regression, to a forest size of d = 500 trees with a maximum depth of 1 decision for random forests. For the deep neural network, after grid searches including the number of layers, number of neurons, several activation functions provided in "TensorFlow", the number of hidden layers was set to h = 2, the number of neurons in layer 1 was set to int((n features + 1)/0.5) and the number of neurons in layer 2 was set to int((n features + 1)/1). A dropout layer with a rate of 0.2 was added after each layer to reduce possible overfitting. The activation functions were linear between the hidden layers and sigmoidal for the binary (men = 0, women = 1) output layer (Fig. 3). Binary crossentropy was set as the loss function, and 1000 epochs were used to train the neural network.
After feature selection, it was tested whether the selected pain threshold variables actually provided sufficient information for sex separation in a sample that was not available during classifier tuning and feature selection. Therefore, the included algorithms were trained with the full and reduced feature sets in a 5 × 20 nested crossvalidation scenario, using randomly selected subsets of 67% of the original training dataset. The trained classifiers were then applied to random subsets comprising each 80% of the validation dataset that had been separated from the full original dataset prior to feature selection and classifier tuning. For the reduced feature sets, hyperparameter tuning was repeated prior to classifier performance evaluation. Balanced accuracy was used as the main parameter to evaluate the classification performance 41 . Additionally, the area under the receiver operating characteristics curve (roc-auc) 42 was calculated. To control possible overfitting, all machine-learning algorithms  Fig. 4 as "reduced" feature set (B) (darker blue columns Fig. 4A), or d = 2 variables that had resulted from further narrowing the feature set to the "sparse" feature set (C) (darkest blue columns Fig. 4A). Variables X 1..11 are the input variables, H 1..n denotes the hidden layers, and the binary output layer is shown on the right. In addition, the respective activation functions are given on the top of each layer. The graph shows the neural networks architectures as implemented with "TensorFlow" and run on an NVIDIA GeForce RTX 3060 GPU (NVIDIA Corporation, Santa Clara, CA, USA). The figure was created using Python version 3.8.13 for Linux (https:// www. python. org), with the seaborn statistical data visualization package (https:// seabo rn. pydata. org 40 ) and Python code modified from https:// towar dsdat ascie nce. com/ deep-learn ingwith-python-neural-netwo rks-compl ete-tutor ial-6b53c 0b06a f0. www.nature.com/scientificreports/ were trained with pain data in that each variable was randomly permuted, with the expectation that a classifier trained with this information should not perform better than guessing, i.e., give a balanced accuracy (or a rocauc) around 50%, else, overfitting could not be ruled out entirely. Furthermore, using a validated approach to the most informative features 43 , classifiers were also trained with the unselected features, i.e., the pain threshold variables that were kept as the result of feature selection. This provided further support that the feature selection had indeed identified the key informative variables.
Hypothesis transfer to the level of molecular pharmacology. From the machine-learning based analysis of the sex-related information contained in pain thresholds, a molecular hypothesis for sex-dependent pain perception and treatment could be derived that narrowed the molecular background on nociceptors for mechanically but not thermally induced pain. This was addressed via knowledge retrieval from publicly accessible databases. Relevant nociceptors were identified by domain expert knowledge and classical literature search among biomedical publications in the PubMed database. The evidence on the molecular background of pain induced by mechanical stimuli provided a basis for potential translation into precision pain medicine approaches.
Ethics approval and consent to participate. The study from which the data set originates followed the  The red lines show the boundaries between the ABC subsets "A", "B" and "C". Category "A" with d = 4 variables is considered to include the most relevant variables for class discrimination. (C) Result from a second cABC analysis performed on the results of the first analysis (recursive cABC analysis). The figure was created using Python version 3.8.13 for Linux (https:// www. python. org), with the seaborn statistical data visualization package (https:// seabo rn. pydata. org 40 ) and our Python package "cABCanalysis" available at https:// pypi. org/ proje ct/ cABCa nalys is/.

Results
Unsupervised analysis using five different common data projection methods and seven different common clustering algorithms ended up with a heterogeneous picture about the data structure and its correspondence to the prior classes. The respective results are reported in the supplemental information.

Sex separation achieved with machine-learning algorithm after supervised training. Pain
threshold variables relevant to sex segregation. Among the 26 approaches to feature selection, pain threshold for blunt pressure stimuli was selected by a majority of 22 approaches (Fig. 4). Item categorization obtained by applying cABC analysis to the sum of votes per approach identified d = 4 variables as belonging to category "A", i.e., the "important few" according to the classical nomenclature for ABC analysis 37 , namely pain thresholds to blunt pressure, punctate pressure with and without hypersensitization by capsaicin cream, and electrical current (Fig. 4). To further narrow the focus to the most relevant variables, a recursive cABC analysis was performed, which consists of repeating the cABC analysis for the items in category "A" of a precedent cABC analysis. This provided a "sparse" feature set comprising d = 1 variable, i.e., the pain threshold to blunt pressure (Fig. 4).
Ability of algorithms to identify a person's sex based on pain sensitivity. When the algorithms were trained with the full pain threshold information (d = 11 variables) from 80% of the study subjects, they identified sex in the other 20% of the subjects with a median balanced accuracy of up to 79% (Table 1). This remained almost unchanged when the algorithms were trained with the d = 4 pain thresholds of the "reduced" feature set. Even with only d = 1 pain threshold of the "sparse" feature set, i.e., pain thresholds to blunt pressure, the balanced accuracy was up to 79%. Table 1. Performance of different classifier types (linear support vector machine, SVM, random forests, logistic regression, linear discriminant analysis, LDA) in assigning sex in the validation data set. The classifiers were trained in a 100-fold nested cross validation setting with subsets randomly drawn from the 80% training data set separated at the beginning of the analyses. Training was performed with all variables (d = 11 painthreshold related variables) as "full" feature set, d = 4 variables that had resulted from the feature selection steps shown in Fig. 4 as "reduced" feature set (darker blue columns Fig. 4A), or d = 2 variables that had resulted from further narrowing the feature set to the "sparse" feature set (darkest blue columns Fig. 4A). In addition, the sex assignment task was repeated using permuted data for algorithm training or using the unselected features, i.e., the pain-threshold information that was not found to be informative for sex inference (light blue columns in Fig. 4A). The numbers are the medians of classification performance on the 20% validation data separated before feature selection and classifier training from the whole data set and not used for feature selection and algorithm training. Shown are the medians and nonparametric 95% confidence intervals (2.5th to 97.5th percentiles) from 100-fold cross-validation runs. www.nature.com/scientificreports/ By contrast, when training the algorithms with the unselected pain thresholds, to which belonged the thermal pain thresholds and the effects of sensitization, they could not identify the subjects' sex better than by guessing. Classification performance was then equivalent to that obtained when permuted data, i.e., nonsense information, was used for training (Fig. 4). This underscores that the informative pain threshold information in the sex context had been indeed identified during feature selection (Fig. 4).
Possible molecular-pharmacological implications. Evidence identified by literature search for nociceptors involved in the perception of pain elicited by mechanical but not thermal noxious stimuli, nor by the hypersensitization induced by capsaicin and menthol, pointed to d = 13 molecular targets ( Table 2). Information retrieval from the DrugBank database provided a 79 × 13 matrix of drugs versus nociceptive targets (Fig. 5). Most of the 79 drugs were calcium channel blockers targeting cardiovascular diseases and might be of interest for drug repurposing to target pain. Of the in total 14,594 unique entries in the DrugBank database, 229 drugs were annotated with a therapeutic indication that contained the substrings "pain" or "nocicep" or "analge". Of those, 10 were among the 79 drugs in above matrix, namely tramadol, enflurane, methadone, chlorzoxazone, meperidine, levomenthol, ethanol, orphenadrine, cns-5161, ketobemidone. Most are well-known substances. Of the lesserknown ones, orphenadrine is mentioned in the DrugBank database as a muscarinic antagonist used to treat drug-induced parkinsonism and to relieve pain from muscle spasm, and CNS-5161 is a N-methyl-d-aspartate ion channel antagonist.

Discussion
The application of machine learning principles 59 allowed the inversion of the association of sex and pain sensitivity. A median balanced accuracy of 74-79% can be considered as good according to the usual criteria for machine learning classifiers. This strongly indicates that a person's sex is highly relevant in pain research and probably in pain treatment. The analyses showed that sex segregation is not evenly distributed across the pain thresholds to different noxious stimuli. Positive evidence was provided that sex can be inferred from mechanical thresholds and negative evidence that it cannot be inferred from thermal thresholds or from the effects of capsaicin/menthol induced hypersensitization. The "not" logic, based on consideration of the unselected features 43 , is likely to hold as well to other machine learning algorithms than the presently used selection, which already covers various architectures up to "deep learning". None of the algorithms showed any tendency to leave the guessing range when trained with the unselected features, i.e., thermal pain thresholds or sensitization effects (Fig. 6).
Implications for pain precision medicine. The logic of "mechanical but not thermal" can be directly translated to the molecular level by focusing on pathways. Three main types of nociceptors are expressed in the skin: mechanosensitive and mechanothermal nociceptors, which signal mainly through myelinated A δ -nerve fibers, and polymodal nociceptors, which are connected to unmyelinated C-nerve fibers 60 . Mechanical and thermal stimuli elicit pain through different pathways; for example, neurons activated by the peripheral application of heat differ from those activated by noxious pressure 61 . Mechanoreceptors and mechano-transducers include Table 2. Molecular sensors of mechanical pain for which no positive evidence was found that they also transmit thermal nociception, or for which negative evidence has been found with respect to an involvement in thermal nociception. The gene symbols used in the original publication are given along with the actual symbols retrieved via the NCBI number using the R library "org.Hs.eg.db" https:// bioco nduct or. org/ packa ges/ relea se/ data/ annot ation/ html/ org. Hs. eg. db. html 47 ). www.nature.com/scientificreports/ ion channels, specialized cytoskeletal proteins, cell junction molecules, and G protein-coupled receptors. There is growing evidence that mechanical stimuli play a role in pathological pain conditions [62][63][64] . Ion channels that have been shown to be involved in the perception of mechanical stimuli include the socalled piezo-receptors, PIEZO1 and PIEZO2, which are currently the largest known ion channels with three identical subunits of approximately 2500 amino acids each and are highly conserved in vertebrates 65 . These ion channels control the uptake of cationic ions, particularly Ca 2+ , in response to a stimulus, allowing them to mediate action potentials and intracellular signal transduction through second messenger signaling 66 . Ca 2+ entry mediated by mechanosensitive ion channels is also tightly regulated by a variety of key proteins distributed in the cell membrane or endoplasmic reticulum. Recent studies have shown that mechanosensitive ion channels can even physically interact with Ca 2+ -regulatory proteins, and these interactions have far-reaching implications for physiology and pathophysiology 67 . n Figure 5. Performance of different classifier types (linear support vector machine, SVM, random forests, logistic regression, linear discriminant analysis, LDA) in assigning sex in the validation data set. The classifiers were trained in a 100-fold nested cross validation setting with subsets randomly drawn from the 80% training data set separated at the beginning of the analyses. Training was performed with all variables (d = 11 painthreshold related variables) as "full" feature set, d = 4 variables that had resulted from the feature selection steps shown in Fig. 4 as "reduced" feature set (darker blue columns Fig. 4A), or d = 1 variable that had resulted from further narrowing the feature set to the "sparse" feature set (darkest blue column Fig. 4A). In addition, the sex assignment task was repeated using permuted data for algorithm training or using the unselected features, i.e., the pain-threshold information that was not found to be informative for sex inference (light blue columns in Fig. 4A). The boxes show the 25th, 50th and 75th percentiles balanced accuracy (BA) and roc-auc for the classification performance in the 20% validation data separated before feature selection and classifier training from the whole data set and not used for feature selection and algorithm training. Whiskers span the 95% confidence interval from the 2.5th to the 97.5th percentiles. The vertical dashed red lines mark the 50% guessing level that must not be touched by the confidence interval of the classification performance measures if the algorithm can be considered successfully trained. The vertical dotted blue lines mark the respective best median classification performance observed across all data set variants and algorithms. The figure has been created using the software package R (version 4.2.2 for Linux; https:// CRAN.R-proje ct. org/ 6 ) and the R libraries "ggplot2" (https:// cran.r-proje ct. org/ packa ge= ggplo t2 7 ) and "ComplexHeatmap" (https:// bioco nduct or. org/ packa ges/ relea se/ bioc/ html/ Compl exHea tmap. html 58  www.nature.com/scientificreports/ The frequency of action potentials in nociceptors is a major determinant of pain intensity. Modulators include HCN ion channels, which generate an inward current following membrane hyperpolarization 68 . The human HCN2 channel is activated by hyperpolarization that exhibits weak selectivity for potassium versus sodium ions 69 . It contributes to native pacemaker currents in the heart and neurons. These channels have recently been linked to chronic pain. Blocking the HCN2 gene has been associated with the disappearance of neuropathic pain episodes 70 . In addition, HCN2 has been shown to interact with HCN1 71 and HCN4 71 by co-assembling homologous subunits to heteromeric complexes. One possible mechanism for the sex-specificity of mechanical nociceptors could be an allosteric interaction with estrogens. That is, while receptors are physiologically stimulated by endogenous orthosteric agonists, few naturally occurring allosteric modulators have been described 72 . Steroid hormones such as progesterone and estradiol exert allosteric effects on cannabinoid 73 , oxytocin 74 , and glycoprotein hormone receptors 75 , respectively, implying that receptors can be indirectly modulated by changing the concentration of the endogenous allosteric modulator 76 . Recently, neurosteroids and neuroactive steroids were shown to act as allosteric modulators of muscarinic receptors, which are a subfamily of G-protein receptors 77,78 and GABA receptors, a class of receptors that respond to the neurotransmitter gamma-aminobutyric acid (GABA), the major inhibitory compound in the mature vertebrate central nervous system 79 . However, the relevance of these allosteric mechanisms to sexspecific mechanical pain remains to be demonstrated, as corresponding evidence for mechanical nociceptors is still lacking in the literature.
Pharmacological application of allosteric modulation is conceivable with emerging approaches to drug design based on secondary binding site effects, where small molecule drugs are designed to bind into secondary binding sites on the targeted biomolecules 80 . According to the present line of reasoning, drugs addressing the molecular targets of mechanical nociceptive stimuli might act better on pain in women than in men and thus be suitable for sex-specific analgesic treatment. For the 10 drugs annotated with effects on pain, evidence for a better effect in women is sparse 81 . Nonetheless, the present explorative analysis illustrates the possibility that the data analysis approach presented in this report could be used for sex-specific precision medicine in pain. Certain analgesics could be preferred or future approaches to drug repurposing could be guided. Of course, this would also require the identification of clinical pain settings in which mechanical stimuli play a main role. To date, sex specific analgesia has been rarely implemented clinically. However, proof-of-concept is provided through the observation that κ-opioid agonists (pentazocine) are more efficient in women with the "red hair-white skin" phenotype based on variants of the melanocortin-1 receptor gene (MC1R) 82 .
While it seems accepted that females have lower pain thresholds for various forms of noxious stimuli 83 , the mechanistic basis of this sexual dimorphism in nociception is far from being fully understood. Transcriptome analyses of the whole dorsal root ganglion of female rats have revealed lower cutaneous mechanical nociceptive thresholds 84 . Several studies have shown that in antinociceptive procedures using thermal, chemical and electrical stimuli, opioids are generally more potent in male than in female rodents, and vice versa 85 . It has also been shown that temporal summation of heat pain is greater in females than in males, suggesting that central processing of nociceptive input may be upregulated in females 86 . Several further studies have investigated temporal summation to noxious mechanical stimulation and examined sex differences in temporal summation of mechanically evoked pain 87 . Their results indicate that temporal summation of mechanically evoked pain is higher in females than in males, is stimulus frequency dependent, and is centrally mediated 86 .
The absence of ovarian hormones caused an increase in mean arterial pressure and a decrease in paw nociception in female rodents. Ovarian hormones play an important role in the mechanical threshold. The ovaries secrete two important hormones, estrogen and progesterone, which play a role in pain modulation 88 . Furthermore, many chronic pain conditions show sex differences in their epidemiology 89 . This may be due to sex-specific differential expression of genes involved in nociceptive pathways, including immune cells, sensory neurons, and the neuroendocrine system 90 . Elucidating how different cell types detect and respond to noxious stimuli may lead to new therapeutic insights and a better understanding of the basic mechanisms of pain plasticity.
Implications for pain data analysis. Seeking structure in data sets can be complex and the standard approach of perform a principal component analysis (PCA) to project data followed by group segregation statistics may not suffice always. In the present data set, if any structure and clusters could be detected in the data with unsupervised analyses (for a complete description of methods and results of unsupervised analyses, see Supplementary information), then after t-SNE (t-distributed stochastic neighborhood embedding) projection ( Supplementary Fig. 5). Supervised approaches to structures that reflected the sex structure in the present data provided more convincing results. However, it was not possible to determine with certainty in advance which method was successful in the present data set. This emphasizes that a complete analysis of pain-related data structures should include several methods.
Second, by using a machine-learning approach, the research question of sex differences in pain could be reversed by addressing the question whether pain can be used to infer sex. This allowed address negative evidence with the thermal pain thresholds directly. That is, the absence of positive evidence as provided in statistically non-significant sex differences in thermal pain thresholds could be extended to a presence of negative evidence by showing that sex inference from thermal thresholds was not better than guessing, similar to training the algorithms with "nonsense" information created by permutation of the variables. This emphasizes the need for negative evidence, regarded as evidence for non-involvement and not only as absent evidence for involvement, which is rarely provided in the scientific literature. For the present molecular targets (Table 2), explicit evidence for non-involvement in thermal pain was found only for HCN2 91 .
Finally, it should be noted that the present analyses cannot cover the full possible spectrum of AI and machine learning in pain research. While several different methods of data analysis were implemented, it is not possible www.nature.com/scientificreports/ to cover all (for an overview, see 58 ). Similarly, the deep learning neural network 32,33 was trained only in one of the discriminative forms (for an overview, see e.g. 59 ) while further architectures were not tested. However, without including other architectures, the utility of deep learning for datasets such as the present one cannot be conclusively judged.

Conclusions
Formulating the question of sex differences in pain as a machine learning problem allowed for an unbiased analysis that concluded that pain thresholds differ relevantly, to the extent that algorithms can be trained with pain threshold information from men and women that they are able to correctly infer sex in new subjects by up to 79% balanced accuracy. This underscores that sex cannot be ignored in pain research and probably in analgesic therapy. By addressing a subsequent hypothesis that included a "NOT" logic, machine learning indicated that sex-relevant information resides in the pain thresholds for mechanical but not thermal stimuli of responses to local hypersensitization. The hypothesis thus formulated could be translated into a molecular hypothesis about nociceptive targets. The present analytic approach highlights that a diverse composition of data science methods including machine learning algorithms can help decipher deep data structures relevant to biomedical and clinical professionals.

Data availability
Data are available from the first author on request.